Adaptable data caching mechanism for in-memory cluster computing

ABSTRACT

An in-memory cluster computing framework node is described. The node includes storage devices having various priorities. The node also includes a resource monitor to monitor the operation of the storage devices. The node also includes a resource scheduler. When the resource monitor indicates that a storage device is at or approaching saturation, the resource scheduler can migrate data from that storage device to another storage device of lower priority.

RELATED APPLICATION DATA

This application is a continuation of U.S. patent application Ser. No. 14/712,895, filed May 14, 2015, now allowed, which claims the benefit of U.S. Provisional Patent Application Ser. No. 62/092,827, filed Dec. 16, 2014, both of which are hereby incorporated by reference for all purposes.

FIELD

The invention pertains to storage, and more particularly to using storage in cluster computing.

BACKGROUND

In-memory cluster computing frameworks are a key component of the modern computing era, and provide an economically viable alternative to specially-built supercomputers. Cluster computing frameworks use commodity hardware that is easily and cheaply obtained. For example, a cluster of personal computers can be networked together to provide computing power that compares favorably (pricewise, if not in terms of physical space) with a supercomputer. But whereas traditional operating systems work well with individual personal computers that are not organized in a cluster, some special software is needed to make a cluster of personal computers work together. Apache Spark™, an example of such software, is growing quickly, and internet-service companies such as Google, Facebook, and Amazon are considering Apache Spark seriously. (Apache, Apache Spark, and Spark are trademarks of The Apache Software Foundation.) Moreover, SAP®, Cloudera™, MapR™, and Datastax are pursuing their efforts to make new products on top of the Apache Spark framework. (SAP is a registered trademark of SAP SE in the United States and other countries. Cloudera is a trademark of Cloudera, Inc. MapR is a trademark of MapR Technologies Inc.)

Apache Spark is well-known for its capability to provide “memory-speed” computations, especially for, but not limited to, iterative, big-data analytics and real-time applications. To achieve such a great performance improvement compared to existing distributed computing platforms such as Apache Hadoop™, Apache Spark needs to keep its data in the memory of the clusters for fast computation in “resilient distributed dataset” (RDD) format. (Apache Hadoop and Hadoop are trademarks of The Apache Software Foundation.)

Existing Apache Spark implementations utilize the memory heap space of Java Virtual Machines (JVM), but this introduces significant performance degradation due to the needed Garbage Collection (GC) time. The GC event pauses the whole JVM and thus literally stops the whole execution.

To alleviate such high costs for maintaining RDD in the memory space of Java, Apache Spark developers came up with another solution, called “Tachyon”. Tachyon utilizes RAMDisks to cache RDD in memory without triggering the GC event in the JVM, while also maintaining the file system in the memory system. Tachyon not only eliminates GC overhead, but provides better separation between the execution engine (Apache Spark) and the storage/cache engine (Tachyon), because Tachyon runs as a different process and is controlled by a central manager which can also be fault-tolerant with other applications such as Zookeeper.

But despite such efforts from the Apache Spark community, performance bottlenecks still exist in Apache Spark and Tachyon. By sharing memory space in the same memory system, both Apache Spark and Tachyon demand high memory bandwidth. Due to this bandwidth sharing, Apache Spark cannot achieve maximum performance.

Moreover, Tachyon, by itself, does not provide any fault tolerance, but relies on the fault tolerance of the underlying storage systems. This lack of fault tolerance within Tachyon can be a serious problem in the case where system engineers optimize cluster system configurations to squeeze the best performance out of the system by mounting non-fault tolerant memory/storage systems for Tachyon implementation.

While the above description focuses on Apache Spark and Tachyon, the problem with determining which devices cache data can potentially be found in any cluster computing framework.

A need remains to better manage the caching of data in a cluster computing framework that solves this and other problems.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows a prior art node in a cluster computing framework.

FIG. 2 shows a node in a cluster computing framework, according to an embodiment of the invention.

FIG. 3 shows more detail about the node of FIG. 2.

FIG. 4 shows details about the resource monitor of FIG. 2.

FIGS. 5A-5B show a flowchart of a procedure for managing cached data in the node of FIG. 2, according to an embodiment of the invention.

FIG. 6 shows a flowchart of how the resource monitor of FIG. 2 can determine the performance information for storage devices, according to an embodiment of the invention.

FIG. 7 shows a flowchart of a procedure for migrating cached data when the resource monitor of FIG. 2 determines migration to be appropriate, according to an embodiment of the invention.

FIG. 8 shows a computer system that can operate as a cluster computing framework node, as described above with reference to FIGS. 2-7.

DETAILED DESCRIPTION

In-memory cluster computing allows data caching not only by data-movement oriented classical caching (temporal and spatial localities), but also by allowing programmer-enforced/suggested data caching. An example of such in-memory cluster computing framework systems is Apache Spark. In the Apache Spark JVM, the execution engine can share its memory location to cache such datasets, which might be stored, for example, in the Java heap space. But this arrangement can result in a huge performance overhead to maintain such datasets in the memory and to keep them live. Moreover, it is not reliable because the execution engine's failure would result in data loss. To alleviate these problems, separated processes can take care of cached data. For example, Tachyon was proposed by the Apache Spark community to address these issues. But Tachyon does not solve more fundamental problems, such as bandwidth and capacity sharing. The problem with Tachyon is that the execution and the storage engine both access the memory: the execution engine to compute, and the storage engine to read/write cached data. This structure not only makes the memory bandwidth a bottleneck, but exacerbates a limited memory space problem as well. Nor are these problems limited to Apache Spark and Tachyon: other in-memory cluster computing frameworks can suffer from similar problems.

FIG. 1 shows a node in some common, existing cluster computing frameworks. A person of ordinary skill in the art will understand that FIG. 1 is an abstraction of the entire node representation. This abstraction simplifies understanding of the operation of the existing computing framework and is useful for understanding the advancements of this disclosure.

In FIG. 1, compute worker 105 includes execution engine 110, which executes an application. Compute worker 105 interfaces with I/O engine 115, which includes cache engine 120. Cache engine 120 is responsible for managing the caching of data in memory only. I/O engine 115 interfaces with various storage devices 125, which can include memory, Solid State Drives (SSDs), Hard Disk Drives (HDDs), and other storage devices. I/O engine 115 also interfaces with existing distributed storage systems 130, which provide distributed storage possibilities.

In existing cluster nodes, there can be many different ways to store data: for example, Dynamic Random Access Memory (DRAM), Storage Class Memory (SCM), or other memory, fast SSDs, slow SSDs, fast hard disk drives (HDDs), slow HDDs, and distributed storage systems. (In some embodiments of the invention, memory is considered to be a storage device, even though memory tends to be used differently from other storage devices.) Because the various storage devices have different bandwidth/capacity characteristics, there is no one-size-fits-all solution. For example, memory caching may be advantageous for compute-intensive benchmarking with a smaller working set size (memory footprint). But alternatively, SSD-caching may be advantageous for an I/O-intensive benchmark, or a benchmark with a large persistent RDD (programmer-directed cached data). For example, PageRank on Apache Spark can benefit from caching RDDs in a high-performance SSD. As a result, embodiments of the invention can include a resource- and demand-aware mechanism to determine the best-performing storage device in an in-memory cluster computing environment, so as to select the best-performing storage device given the application being executed. Such a mechanism can also select the best method and storage device to provide fault tolerance.

Embodiments of the invention include a resource- and demand-aware caching mechanism as an intermediate layer between the execution engine and the distributed storage system to cache data such as RDD in Apache Spark. The execution engine sends data read/write requests to the I/O engine for this caching layer and the I/O engine responds to these requests. Embodiments of the invention introduce new components, such as a resource monitor and a resource scheduler.

FIG. 2 shows a node in a cluster computing framework, according to an embodiment of the invention. In FIG. 2, compute worker 105 and existing distributed storage systems 130 are unchanged from FIG. 1. But I/O engine 115 from FIG. 1 now includes resource monitor 205 and resource scheduler 210. Together, resource monitor 205 and resource scheduler 210 can be considered as caching mechanism 215. Resource monitor 205 can monitor the operation of various storage devices available to the node. Resource scheduler 210 takes the information determined by resource monitor 205 and can decide whether the current devices provide adequate caching support for the data stored thereon and for potential future caching requests. If resource scheduler 210 decides that the current devices do not provide adequate caching support, then resource scheduler 210 can change what data is cached on what device.

In FIG. 2, memory 220, SCM 225, SSD 230 and HDD 235 are identified separately, as each can be used to cache data for applications. The use of other forms of storage is also possible. SCM is a new variety of storage, designed to bridge the gap between memory, such as DRAM, and other storage devices, such as SSDs and HDDs. SCM is designed to have performance characteristics similar to DRAM, but with cost and capacity closer to HDDs. In addition, although a single storage device could be used to cache all data for all applications, it is also possible for different storage devices to be used to cache data for different programs. Thus, memory 220 might cache a first data, SCM 225 might cache a second data, SSD 230 might cache a third data, and HDD 235 might cache a fourth data.

Resource monitor 205 and resource scheduler 210 can recognize different devices for caching. For a simple computation, information about devices can be kept in a sorted order. The sorting metric can be bandwidth, latency, capacity, etc. For example, sorting devices based on latency, the sorted order (from highest to lowest priority) might be: memory 220 > SCM 225 > SSD 230 > local HDD 235 > distributed storage 130. But a person skilled in the art will recognize that more complicated computations can be used, with different sorting metrics. Resource monitor 205 can determine the performance characteristics for the storage devices, and resource scheduler 210 can then prioritize the storage devices based on whatever sorting metric is chosen. Resource scheduler 210 can select the sorting metric, or the sorting metric can be chosen by another component of the system, such as the operating system.
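The sorted-order bookkeeping described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the `Device` class, the `prioritize` function, and every latency and bandwidth figure are hypothetical placeholders chosen only to reproduce the example ordering.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str
    latency_us: float      # placeholder value, not a measurement
    bandwidth_mbps: float  # placeholder value, not a measurement


def prioritize(devices, metric="latency"):
    """Return the devices ordered from highest to lowest priority."""
    if metric == "latency":
        # Lower latency means higher priority.
        return sorted(devices, key=lambda d: d.latency_us)
    if metric == "bandwidth":
        # Higher throughput means higher priority.
        return sorted(devices, key=lambda d: d.bandwidth_mbps, reverse=True)
    raise ValueError(f"unknown sorting metric: {metric}")


devices = [
    Device("local HDD", 5000.0, 150.0),
    Device("memory", 0.1, 20000.0),
    Device("SSD", 100.0, 500.0),
    Device("SCM", 1.0, 5000.0),
    Device("distributed storage", 20000.0, 100.0),
]
latency_order = [d.name for d in prioritize(devices, "latency")]
print(" > ".join(latency_order))
```

With a different metric the same devices may be ranked differently, which is why the metric is kept as a parameter rather than fixed.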

The performance characteristics of the storage devices and the sorting metric can be determined in advance, since the sorting metric typically is independent of the actual use of the storage device. For example, latency depends on how long it takes to access data, which does not depend on how much or how little data is stored on the storage device. A person skilled in the art will recognize other possible sorting metrics. For example, bandwidth might be used as a sorting metric, where devices that can provide greater throughput of data are considered higher priority.

The performance characteristics can also be determined by analyzing the performance of the storage device during run-time. For example, resource monitor 205 can track the operations of memory 220, SCM 225, SSD 230, local HDD 235, and distributed storage 130, along with any other storage devices that might be available to the cluster computing node to determine their maximum bandwidth and their bandwidth utilization. Resource monitor 205 can determine bandwidth utilization, for example, by monitoring how much data moves to and from the storage device for a given interval of time. As a more specific example, resource monitor 205 might monitor a storage device for 5 ms and detect 1 MB of data being sent to or from the device. From this, resource monitor 205 can calculate the average measured bandwidth as 200 MB/sec. By comparing this calculation with the maximum available bandwidth for the device, resource monitor 205 can calculate a bandwidth utilization percentage. Resource monitor 205 can perform this analysis at intervals to track the overall bandwidth utilization rate of the storage devices over time, and if a storage device is approaching its bandwidth limits, resource monitor 205 can indicate that data should be migrated to another storage device. (Resource monitor 205 can also perform continuous analysis of the bandwidth of the storage devices, rather than periodic analysis.)
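The arithmetic in this example can be written out as a short sketch. The function names are hypothetical, and the 500 MB/sec maximum bandwidth is an assumed figure used only to show how a utilization percentage follows from the measured 200 MB/sec.

```python
def measured_bandwidth_mb_per_s(bytes_moved, interval_ms):
    """Average bandwidth over one monitoring interval, in MB/sec."""
    megabytes = bytes_moved / 1_000_000
    seconds = interval_ms / 1000
    return megabytes / seconds


def bandwidth_utilization(measured_mb_per_s, max_mb_per_s):
    """Fraction of the device's maximum bandwidth currently in use."""
    return measured_mb_per_s / max_mb_per_s


# 1 MB observed during a 5 ms window, as in the example above.
bw = measured_bandwidth_mb_per_s(1_000_000, 5)
# Assumed device maximum of 500 MB/sec (not from the specification).
util = bandwidth_utilization(bw, 500.0)
print(f"{bw:.0f} MB/sec measured, {util:.0%} of maximum")
```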

As another example, resource monitor 205 can determine latency by calculating how long a device takes between receiving a data request and returning the result of the data request. Averaging such calculations over a number of data requests can provide a reasonable estimate of the latency of the storage device.
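A minimal sketch of this averaging, assuming the monitor records a (request time, response time) pair in seconds for each data request; the function name and the sample timestamps are hypothetical.

```python
def average_latency_ms(samples):
    """Estimate latency from (request_time_s, response_time_s) pairs."""
    samples = list(samples)
    if not samples:
        raise ValueError("need at least one request to estimate latency")
    total_s = sum(done - issued for issued, done in samples)
    return total_s / len(samples) * 1000.0


# Two observed requests taking 2 ms and 4 ms respectively.
estimate = average_latency_ms([(0.0, 0.002), (1.0, 1.004)])
print(f"estimated latency: {estimate:.1f} ms")
```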

Resource monitor 205 can also test the storage devices. For example, resource monitor 205 can wait until one or more of the storage devices is not being utilized (or is minimally utilized), as might occur during overnight hours. Resource monitor 205 can then test the storage devices to compute the performance characteristics of the storage devices with minimal impact to user operations. For example, resource monitor 205 can write a very large file to measure how long it takes and determine the bandwidth of the storage device. Or, resource monitor 205 can request the storage device to read a particular address and measure how long the storage device takes to respond (thereby measuring its latency). Or, resource monitor 205 can request the storage device to advise how much data it currently stores: relative to the overall capacity of the storage device, this calculation can measure the storage device's fullness.

FIG. 2 also shows priorities for the various storage devices. Memory 220 has priority 240, SCM 225 has priority 245, SSD 230 has priority 250, HDD 235 has priority 255, and distributed storage system 130 has priority 260. Each storage device, and each storage device class, can have its own priority, based on any desired sorting metric. There can be any number of storage devices or storage device classes: embodiments of the claimed invention are not limited to the number of storage devices shown in FIG. 2. In addition, there is no requirement that different devices have different priorities. For example, two devices might have the same priority using a particular sorting metric. In such a situation, a secondary sorting metric could be used to distinguish among the devices, or the system can arbitrarily select one device or another when data needs to be cached.
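One way to sketch the secondary sorting metric is a composite sort key. The device names and figures below are hypothetical; latency is assumed as the primary metric and bandwidth as the tie-breaker.

```python
# Two SSDs tie on latency; bandwidth breaks the tie.
devices = [
    {"name": "ssd_a", "latency_us": 100.0, "bandwidth_mbps": 500.0},
    {"name": "ssd_b", "latency_us": 100.0, "bandwidth_mbps": 900.0},
    {"name": "memory", "latency_us": 0.1, "bandwidth_mbps": 20000.0},
]


def rank(devices):
    """Order by latency (ascending), then by bandwidth (descending)."""
    return sorted(
        devices,
        key=lambda d: (d["latency_us"], -d["bandwidth_mbps"]),
    )


ranked_names = [d["name"] for d in rank(devices)]
print(ranked_names)
```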

In the example cluster computing framework node of FIG. 2, caching mechanism 215 can start with a static configuration, if one is provided to it. For example, an application might specify a particular storage device (or device class) as the preferred caching device. Alternatively, caching mechanism 215 can start with a device of the highest priority (such as memory, when using the latency sorting metric in the example above). When resource monitor 205 detects that a storage device is experiencing saturation, resource scheduler 210 can change the caching device to a device with the next highest priority and migrate cached data, or requests to cache data, onto it. Saturation can mean, for example, that the bandwidth in use on the storage device is at or approaching its maximum (bandwidth saturation) or that the storage device is at or approaching the maximum amount of data it can store (capacity saturation). As noted, resource scheduler 210 does not need to wait until the storage device is fully saturated: resource scheduler 210 can migrate data or re-route data cache requests when the storage device is approaching saturation, which can be determined, for example, as a threshold percentage (such as 90%) of full saturation. Any desired threshold percentage can be used: the use of 90% above is merely exemplary. The migrated cached data, or requests to cache data, can include all data, or just selected data (such as the oldest cached data on the device). If there is no further device with lower priority, resource scheduler 210 can continue working with the current device. Likewise, caching mechanism 215 can detect when a caching device is under-utilized, and can move data to a lower priority device if the usage pattern fits within that lower device's profiles.
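The saturation check and the fall-back to the next device can be sketched as follows. The function names are hypothetical, the 90% threshold is the exemplary value from the text, and device classes are represented by plain strings for brevity.

```python
def is_saturated(bandwidth_util, capacity_util, threshold=0.90):
    """True if either bandwidth or capacity use reaches the threshold."""
    return bandwidth_util >= threshold or capacity_util >= threshold


def next_cache_device(priority_order, current):
    """Return the next lower-priority device, or `current` if none exists."""
    i = priority_order.index(current)
    if i + 1 < len(priority_order):
        return priority_order[i + 1]
    return current  # no further device: keep working with the current one


order = ["memory", "SCM", "SSD", "HDD", "distributed storage"]
current = "memory"
if is_saturated(bandwidth_util=0.95, capacity_util=0.40):
    current = next_cache_device(order, current)
print(current)
```

Keeping the threshold as a parameter reflects the statement that any desired percentage can be used.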

Regardless of the source of the configuration information, data storage typically begins with a device with a priority acceptable to the application. So long as the selected storage device can provide adequate caching support, there is no need to migrate the data or cache requests. Data is migrated if the selected device caching the data does not (or does not appear to) provide acceptable levels of service.

According to this disclosure, data and cache requests need not necessarily migrate from higher priority devices to lower priority devices. Resource scheduler 210 can also check to see if a higher priority device is able to provide adequate service and, if so, can migrate data/cache requests back to the higher priority device.

As noted above, resource scheduler 210 can migrate not only cached data, but requests to cache new data. In some embodiments of the invention, the fact that one device is considered saturated can have an impact on future data caching. For example, if a particular device is sufficiently saturated that data needs to be migrated off the device, that fact suggests that the device might still be saturated in the foreseeable future. Therefore, future data should not be cached on the device. But in other embodiments of the invention, the fact that the resource scheduler migrates data from one device to another does not impact the selection of an initial device to cache future data. That is, the selection of the initial device to cache future data may not depend on whether a device was considered saturated, and a device that was previously considered to be saturated can still be selected to cache new data.

Whether there is a causal relationship between migrating cached data off a device and that device's ability to cache data in the future is variable. For example, if there is one particular dataset that dominates the device's capabilities, migrating that dataset off the device might leave the device sufficiently unsaturated that the device can cache future data. On the other hand, if the data stored on the device is fairly uniform in size, a significant percentage of data might have to be moved off the storage device before the device would become less saturated. In that situation, migrating a few datasets off the storage device might not improve the saturation of the device, in which case future cache requests are likely better directed toward another device.

FIG. 3 shows an example node in the cluster computing framework, according to some embodiments of the invention. In FIG. 3, node 305 is shown, which a person of ordinary skill in the art will recognize is typically one of many nodes in the cluster. Node 305 can include multiple workers 105, each with its own execution engine 110 for executing applications. I/O engine 115 can include resource monitor 205, resource scheduler 210, and replicator 310. As discussed below, embodiments of the claimed invention can include replication of data to provide a measure of fault tolerance that might otherwise not be present in the cluster computing framework.

Node 305 can also include CPU 315, which can execute instructions for the various workers 105, and storage devices such as memory 220, SCM 225, SSD 230, and HDD 235.

As described above, resource monitor 205 can determine what the capabilities are of the various storage devices. By determining the capabilities of the storage devices, it becomes possible for resource scheduler 210 to know whether one or more of the storage devices are reaching the limits of their capabilities. Resource monitor 205 can determine the capabilities of the storage devices in several different manners.

As shown in FIG. 4, in one embodiment of the invention, resource monitor 205 can include profiler 405. Profiler 405 can access the various storage devices and determine their capabilities. There are many different techniques by which profiler 405 can determine the capabilities of the storage devices. For example, profiler 405 can access the storage devices and read their capabilities directly from the devices, if the devices include such information in electronic form. Or, profiler 405 can determine a model number from the devices, and then access their capabilities off a website on the Internet, or from an internal storage listing device capabilities. Profiler 405 can also determine the device capabilities and then store those capabilities for future reference. Or, profiler 405 can perform read/write operations on the device to determine its capabilities, possibly measured against time. For example, a device's latency can be determined by measuring how many milliseconds it takes between a read/write command and the result being returned. Or a device's bandwidth can be determined by reading/writing a large amount of data relative to the amount of time it takes to complete the command. A person of ordinary skill in the art will recognize that there are other ways in which a device's capabilities can be determined. A device's capabilities can be determined in advance of the use of the node, and accessed from some storage. In this manner, the capabilities of the devices can be determined statically.

In another embodiment of the invention, the capabilities of the devices can be determined dynamically. In this embodiment, run-time monitor 410 can be used. Run-time monitor 410 can monitor the operation of the various storage devices during their ordinary operation to determine the capabilities of the storage devices. For example, run-time monitor 410 can measure the time between a request to read/write data from a storage device and when the result is returned to determine the latency of the storage device. Or run-time monitor 410 can measure the time it takes to read/write a large amount of data to determine the bandwidth of the device.

In addition, in some embodiments of the invention, run-time monitor 410 can be used to determine the current operation of the storage devices. That is, instead of determining, for example, the maximum bandwidth of a device, run-time monitor 410 can determine how much of the device's bandwidth is currently being used. This measurement enables the resource scheduler to determine whether or not cached data, or requests to cache data, need to be migrated from one storage device to another. While this example considers the bandwidth saturation of the storage device, a person skilled in the art will recognize that any capability of the device can be measured: for example, the capacity saturation of the device (i.e., how much data the device is currently storing).

Resource scheduler 210 of FIG. 2 can also provide resource-aware fault tolerance. Resource scheduler 210 can provide redundant data copies via other devices. In addition, resource scheduler 210 can provide redundant copies in the same node or in other cluster nodes. Resource scheduler 210 can provide a configurable redundant ratio via a replication factor. Resource scheduler 210 can also provide a configurable frequency via check-point interval parameters in the case where check-pointing was selected as a fault tolerance method. Check-pointed RDDs can also be replicated via a replication factor if the user specifies the redundant factor along with check-pointing interval. In this case, resource scheduler 210 can provide fault tolerance redundantly (replication and check-pointing).

To provide some examples, if DRAM is selected as a caching device, resource scheduler 210 can provide fault tolerance based on check-pointing to the next (non-volatile) device (e.g., PRAM, SCM, SSD, HDD, distributed storage, or the like). If SSD is selected as a caching device, resource scheduler 210 can provide fault tolerance based either on replication to other SSDs, check-pointing to other device types, or both. If HDD is selected as a caching device, resource scheduler 210 can provide fault tolerance based either on replication to other HDDs, check-pointing to other device types, or both. In addition, in all of these examples the replication or check-pointing can be done to storage devices on the same cluster node or on a different cluster node. Having redundant data across different cluster nodes enables protection against node failure and, if the nodes are on different server racks, protection against rack-power failure. Where fault tolerance is provided on different cluster nodes, resource scheduler 210 tasks in the various cluster nodes can communicate with each other to provide inter-node replication and check-pointing.
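These examples can be summarized in a small decision sketch. Everything here is illustrative: the function and constant names are hypothetical, only DRAM is treated as volatile, and choosing the next lower-priority device as the check-point target for SSD and HDD is an assumption (the text only says "other device types").

```python
# Device classes in priority order (assumed for illustration).
PRIORITY_ORDER = ["DRAM", "SCM", "SSD", "HDD", "distributed storage"]
VOLATILE = {"DRAM"}  # assumption: every other class is non-volatile


def fault_tolerance_plan(cache_device, replicate=True, checkpoint=False):
    """Pick replication and check-point targets for a caching device."""
    idx = PRIORITY_ORDER.index(cache_device)
    lower = [d for d in PRIORITY_ORDER[idx + 1:] if d not in VOLATILE]
    next_lower = lower[0] if lower else None
    if cache_device in VOLATILE:
        # Volatile cache: check-point to the next non-volatile device.
        return {"replicate_to": None, "checkpoint_to": next_lower}
    return {
        # Replication goes to another device of the same class.
        "replicate_to": cache_device if replicate else None,
        # Assumed policy: check-point to the next lower-priority class.
        "checkpoint_to": next_lower if checkpoint else None,
    }


dram_plan = fault_tolerance_plan("DRAM")
ssd_plan = fault_tolerance_plan("SSD", replicate=True, checkpoint=True)
print(dram_plan, ssd_plan)
```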

FIGS. 5A-5B show a flowchart of a procedure for managing cached data in the node of FIG. 2, according to an embodiment of the invention. In FIG. 5A, at block 505, resource monitor 205 determines the capabilities of one or more storage devices on the cluster node. At block 510, resource scheduler 210 caches data on a storage device. At block 515, replicator 310 replicates the data on a second storage device, for fault tolerance.

At block 520 (FIG. 5B) resource monitor 205 monitors the performance of the storage device caching the data. As described above, there can be more than one storage device caching data: at block 520, any or all storage devices caching data can be monitored. In addition, storage devices not currently caching data can also be monitored. By monitoring other storage devices, resource scheduler 210 can select an appropriate storage device for data migration/request re-direction if a storage device caching data becomes saturated. At block 525, resource scheduler 210 determines if the storage device caching the data is saturated (or approaching saturation). If so, then at block 530 resource scheduler 210 migrates the data to another storage device (a third storage device), and at block 535 resource scheduler 210 can re-direct future cache requests destined for the first storage device to the third storage device.
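One pass through blocks 520-535 can be sketched as below. This is a simplified model under stated assumptions: the cache is a dictionary from device name to a list of cached items, utilization values are supplied by the caller in place of a real monitor, and the function name `manage_once` is hypothetical.

```python
def manage_once(cache, utilization, order, threshold=0.90):
    """Run one monitoring pass; return {saturated_device: new_target}."""
    redirects = {}
    for i, dev in enumerate(order[:-1]):
        # Block 525: is a device that holds cached data saturated?
        if cache.get(dev) and utilization.get(dev, 0.0) >= threshold:
            target = order[i + 1]
            # Block 530: migrate (copy to the target, then delete).
            cache.setdefault(target, []).extend(cache[dev])
            cache[dev] = []
            # Block 535: re-direct future cache requests.
            redirects[dev] = target
    return redirects


order = ["memory", "SSD", "HDD"]
cache = {"memory": ["rdd1", "rdd2"], "SSD": []}
redirects = manage_once(cache, {"memory": 0.95, "SSD": 0.10}, order)
print(redirects, cache)
```

Calling the function repeatedly corresponds to returning control to block 520 for further monitoring.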

In FIGS. 5A-5B (and in the other flowcharts below), one embodiment of the invention is shown. But a person skilled in the art will recognize that other embodiments of the invention are also possible, by changing the order of the blocks, by omitting blocks, or by including links not shown in the drawings. For example, block 515 can be omitted to eliminate the fault tolerance of the system, but maintain the caching operations. Or after block 535, control can return to block 520 for more monitoring of the devices. All such variations of the flowcharts are considered to be embodiments of the invention, whether expressly described or not.

There is an interesting interplay between how the resource scheduler handles data migration when a storage device becomes saturated and how the resource scheduler handles data replication for fault tolerance. When data is replicated, in some embodiments it is replicated to a storage device that has a priority no higher than the storage device that provides the caching service. But when the resource scheduler migrates data from a higher priority device to a lower priority device, the replicated data might now be resident on a device with a higher priority than the device now caching the data.

There are two ways to address this situation. One solution is to do nothing: the replication is simply to provide fault tolerance, and the fact that the data is replicated on a higher priority device than the cached data is simply a curious artifact. (In fact, fault tolerance does not necessarily require replication on a lower priority device: there is no reason why data replication could not be performed onto any available device, regardless of priority.) The other solution is to migrate the replicated data to ensure that the replicated data does not have a higher priority than the cached data.

FIG. 6 shows a flowchart of how resource monitor 205 of FIG. 2 can determine the performance information for storage devices, according to an embodiment of the invention. In FIG. 6, at block 605, resource monitor 205 can access performance information for the storage device. Resource monitor 205 can access the performance information from the storage device or from some accessible storage, either local or networked. Alternatively, at block 610, resource monitor 205 can run a profiler on the device to determine the device's performance information. Alternatively, at block 615, resource monitor 205 can determine the device's performance information from monitoring the operation of the storage device in real-time.

The various ways to determine performance information shown in FIG. 6 are not mutually exclusive. For example, the storage device might provide performance information as in block 605, but the system might also periodically run a profiler on the storage device as in block 610, to ensure that the storage device is still performing to specification.

FIG. 7 shows a flowchart of a procedure for migrating cached data when the resource monitor of FIG. 2 determines migration to be appropriate, according to an embodiment of the invention. In FIG. 7, at block 705, resource scheduler 210 can migrate all data from the storage device that is at (or near) saturation to another device. Migration can include copying the data from one storage device to another, and then deleting the migrated data from the first storage device. Alternatively, at block 710, resource scheduler 210 can migrate selected data from the storage device to another device. Which data is selected for migration can be determined by resource scheduler 210 as appropriate: for example, resource scheduler 210 might select the oldest data resident on the storage device for migration, or resource scheduler 210 might select the largest data file(s) resident on the storage device for migration.
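The choice between migrating all data (block 705) and migrating selected data (block 710) can be sketched as a selection policy. The `CachedItem` class, the policy names, and the sample items are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CachedItem:
    key: str
    size_bytes: int
    cached_at: float  # timestamp at which the item was cached


def select_for_migration(items, policy="all", count=1):
    """Choose which cached items to move off a saturated device."""
    if policy == "all":       # block 705: migrate everything
        return list(items)
    if policy == "oldest":    # block 710: oldest resident data first
        return sorted(items, key=lambda it: it.cached_at)[:count]
    if policy == "largest":   # block 710: largest data first
        return sorted(items, key=lambda it: it.size_bytes, reverse=True)[:count]
    raise ValueError(f"unknown migration policy: {policy}")


items = [
    CachedItem("rdd_a", size_bytes=400, cached_at=30.0),
    CachedItem("rdd_b", size_bytes=900, cached_at=20.0),
    CachedItem("rdd_c", size_bytes=100, cached_at=10.0),
]
oldest = [it.key for it in select_for_migration(items, "oldest")]
largest = [it.key for it in select_for_migration(items, "largest")]
print(oldest, largest)
```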

FIG. 8 shows a computer system that can operate as a cluster computing framework node, as described above with reference to FIGS. 2-7. In FIG. 8, computing system 805 can also include a clock 810, random access memory (RAM) 815, user interface 820, solid state drive/disk (SSD) 230, network connector 825, such as an Ethernet connector, processor 315, and/or memory controller 830, any or all of which may be electrically coupled to system bus 835. I/O engine 115 can correspond to the I/O engine described in detail above, and as set forth herein, and may also be electrically coupled to the system bus 835. I/O engine 115 can include or otherwise interface with clock 810, random access memory (RAM) 815, user interface 820, solid state drive/disk (SSD) 230, network connector 825, processor 315, and/or memory controller 830.

Embodiments of the invention can extend to the following statements, without limitation:

An embodiment of the invention includes an in-memory cluster computing framework node, comprising: a processor; a first storage device storing cached data, the first storage device having a first priority; a second storage device having a second priority; a resource monitor operative to monitor the first storage device; and a resource scheduler operative to migrate the cached data from the first storage device to the second storage device if the resource monitor indicates that the first storage device is saturated.

An embodiment of the invention includes an in-memory cluster computing framework node, comprising: a processor; a first storage device storing cached data, the first storage device having a first priority; a second storage device having a second priority; a resource monitor operative to monitor a bandwidth saturation of the first storage device; and a resource scheduler operative to migrate the cached data from the first storage device to the second storage device if the resource monitor indicates that the first storage device is saturated.

An embodiment of the invention includes an in-memory cluster computing framework node, comprising: a processor; a first storage device storing cached data, the first storage device having a first priority; a second storage device having a second priority; a resource monitor operative to monitor a capacity saturation of the first storage device; and a resource scheduler operative to migrate the cached data from the first storage device to the second storage device if the resource monitor indicates that the first storage device is saturated.

An embodiment of the invention includes an in-memory cluster computing framework node, comprising: a processor; a first storage device storing cached data, the first storage device having a first priority; a second storage device having a second priority; a resource monitor operative to determine the capabilities of the first storage device and to monitor the first storage device; and a resource scheduler operative to migrate the cached data from the first storage device to the second storage device if the resource monitor indicates that the first storage device is saturated.

An embodiment of the invention includes an in-memory cluster computing framework node, comprising: a processor; a first storage device storing cached data, the first storage device having a first priority; a second storage device having a second priority; a resource monitor operative to determine the capabilities of the first storage device and to monitor the first storage device; and a resource scheduler operative to migrate the cached data from the first storage device to the second storage device if the resource monitor indicates that the first storage device is saturated, wherein the resource monitor is operative to access performance information from the first storage device.

An embodiment of the invention includes an in-memory cluster computing framework node, comprising: a processor; a first storage device storing cached data, the first storage device having a first priority; a second storage device having a second priority; a resource monitor operative to determine the capabilities of the first storage device and to monitor the first storage device; and a resource scheduler operative to migrate the cached data from the first storage device to the second storage device if the resource monitor indicates that the first storage device is saturated, wherein the resource monitor includes a profiler to profile the first storage device.

An embodiment of the invention includes an in-memory cluster computing framework node, comprising: a processor; a first storage device storing cached data, the first storage device having a first priority; a second storage device having a second priority; a resource monitor operative to determine the capabilities of the first storage device using a run-time monitor and to monitor the first storage device; and a resource scheduler operative to migrate the cached data from the first storage device to the second storage device if the resource monitor indicates that the first storage device is saturated.

An embodiment of the invention includes an in-memory cluster computing framework node, comprising: a processor; a first storage device storing cached data, the first storage device having a first priority; a second storage device having a second priority; a resource monitor operative to monitor the first storage device; and a resource scheduler operative to migrate the cached data from the first storage device to the second storage device if the resource monitor indicates that the first storage device is saturated, wherein the resource scheduler is operative to select the first storage device to initially cache the data based on information provided by an application that uses the data.

An embodiment of the invention includes an in-memory cluster computing framework node, comprising: a processor; a first storage device storing cached data, the first storage device having a first priority; a second storage device having a second priority; a resource monitor operative to monitor the first storage device; and a resource scheduler operative to migrate the cached data from the first storage device to the second storage device if the resource monitor indicates that the first storage device is saturated, wherein the first priority is higher than the second priority and the resource scheduler is operative to select the first storage device to initially cache the data as a higher priority device.

An embodiment of the invention includes an in-memory cluster computing framework node, comprising: a processor; a first storage device storing cached data, the first storage device having a first priority; a second storage device having a second priority; a resource monitor operative to monitor the first storage device; and a resource scheduler operative to migrate the cached data from the first storage device to the second storage device if the resource monitor indicates that the first storage device is saturated, wherein the first priority is higher than the second priority, and the resource scheduler is operative to select the first storage device to initially cache the data as a higher priority device and to select the second storage device for future data caching if the resource monitor indicates that the first storage device is saturated.

An embodiment of the invention includes an in-memory cluster computing framework node, comprising: a processor; a first storage device storing cached data, the first storage device having a first priority; a second storage device having a second priority; a resource monitor operative to monitor the first storage device; a resource scheduler operative to migrate the cached data from the first storage device to the second storage device if the resource monitor indicates that the first storage device is saturated; and a replicator to replicate the cached data on a third storage device.

An embodiment of the invention includes an in-memory cluster computing framework node, comprising: a processor; a first storage device storing cached data, the first storage device having a first priority; a second storage device having a second priority; a resource monitor operative to monitor the first storage device; a resource scheduler operative to migrate the cached data from the first storage device to the second storage device if the resource monitor indicates that the first storage device is saturated; and a replicator to replicate the cached data on a third storage device having a third priority, wherein the third priority is the same as the first priority.

An embodiment of the invention includes an in-memory cluster computing framework node, comprising: a processor; a first storage device storing cached data, the first storage device having a first priority; a second storage device having a second priority; a resource monitor operative to monitor the first storage device; a resource scheduler operative to migrate the cached data from the first storage device to the second storage device if the resource monitor indicates that the first storage device is saturated; and a replicator to replicate the cached data on a third storage device having a third priority, wherein the third priority is lower than the first priority.

An embodiment of the invention includes an in-memory cluster computing framework node, comprising: a processor; a first storage device storing cached data, the first storage device having a first priority; a second storage device having a second priority; a resource monitor operative to monitor the first storage device; a resource scheduler operative to migrate the cached data from the first storage device to the second storage device if the resource monitor indicates that the first storage device is saturated; and a replicator to replicate the cached data on a third storage device having a third priority, wherein the third storage device is in a second in-memory cluster computing framework node.

An embodiment of the invention includes an in-memory cluster computing framework node, comprising: a processor; a first storage device storing cached data, the first storage device having a first priority; a second storage device having a second priority; a resource monitor operative to monitor the first storage device; a resource scheduler operative to migrate the cached data from the first storage device to the second storage device if the resource monitor indicates that the first storage device is saturated; and a replicator to replicate the cached data on a third storage device having a third priority, wherein the third storage device is in a second in-memory cluster computing framework node and the third priority is the same as the first priority.

An embodiment of the invention includes an in-memory cluster computing framework node, comprising: a processor; a first storage device storing cached data, the first storage device having a first priority; a second storage device having a second priority; a resource monitor operative to monitor the first storage device; a resource scheduler operative to migrate the cached data from the first storage device to the second storage device if the resource monitor indicates that the first storage device is saturated; and a replicator to replicate the cached data on a third storage device having a third priority, wherein the third storage device is in a second in-memory cluster computing framework node and the third priority is lower than the first priority.

An embodiment of the invention includes an in-memory cluster computing framework node, comprising: a processor; a first storage device storing cached data, the first storage device having a first priority; a second storage device having a second priority; a resource monitor operative to monitor the first storage device; a resource scheduler operative to migrate the cached data from the first storage device to the second storage device if the resource monitor indicates that the first storage device is saturated; and a replicator to replicate the cached data on a third storage device having a third priority, wherein the third storage device is in a second in-memory cluster computing framework node and is specified by a user.

An embodiment of the invention includes an in-memory cluster computing framework node, comprising: a processor; a first storage device storing cached data, the first storage device having a first priority; a second storage device having a second priority; a resource monitor operative to monitor the first storage device; and a resource scheduler operative to migrate the cached data from the first storage device to the second storage device if the resource monitor indicates that the first storage device is saturated, wherein the data includes a resilient distributed dataset (RDD) on the first storage device.

An embodiment of the invention includes an in-memory cluster computing framework node, comprising: a processor; a first storage device storing cached data, the first storage device having a first priority; a second storage device having a second priority; a resource monitor operative to monitor the first storage device; and a resource scheduler operative to migrate all cached data from the first storage device to the second storage device if the resource monitor indicates that the first storage device is saturated.

An embodiment of the invention includes an in-memory cluster computing framework node, comprising: a processor; a first storage device storing cached data, the first storage device having a first priority; a second storage device having a second priority; a resource monitor operative to monitor the first storage device; and a resource scheduler operative to migrate an oldest cached data from the first storage device to the second storage device if the resource monitor indicates that the first storage device is saturated.

An embodiment of the invention includes a method for caching data in an in-memory cluster computing framework, comprising: caching a data on a first storage device with a first priority in a cluster node; monitoring the operation of the first storage device; and if the first storage device is saturated, migrating the cached data to a second storage device with a second priority.

An embodiment of the invention includes a method for caching data in an in-memory cluster computing framework, comprising: caching a data on a first storage device with a first priority in a cluster node; monitoring the operation of the first storage device including monitoring a bandwidth saturation of the first storage device; and if the first storage device is saturated, migrating the cached data to a second storage device with a second priority.

An embodiment of the invention includes a method for caching data in an in-memory cluster computing framework, comprising: caching a data on a first storage device with a first priority in a cluster node; monitoring the operation of the first storage device including monitoring a capacity saturation of the first storage device; and if the first storage device is saturated, migrating the cached data to a second storage device with a second priority.

An embodiment of the invention includes a method for caching data in an in-memory cluster computing framework, comprising: caching a data on a first storage device with a first priority in a cluster node; monitoring the operation of the first storage device including determining a capability of the first storage device; and if the first storage device is saturated, migrating the cached data to a second storage device with a second priority.

An embodiment of the invention includes a method for caching data in an in-memory cluster computing framework, comprising: caching a data on a first storage device with a first priority in a cluster node; monitoring the operation of the first storage device including determining a capability of the first storage device including accessing performance information from the first storage device; and if the first storage device is saturated, migrating the cached data to a second storage device with a second priority.

An embodiment of the invention includes a method for caching data in an in-memory cluster computing framework, comprising: caching a data on a first storage device with a first priority in a cluster node; monitoring the operation of the first storage device including determining a capability of the first storage device including running a profiler on the first storage device; and if the first storage device is saturated, migrating the cached data to a second storage device with a second priority.

An embodiment of the invention includes a method for caching data in an in-memory cluster computing framework, comprising: caching a data on a first storage device with a first priority in a cluster node; monitoring the operation of the first storage device including determining a capability of the first storage device including determining current bandwidth for the first storage device from run-time monitoring; and if the first storage device is saturated, migrating the cached data to a second storage device with a second priority.

An embodiment of the invention includes a method for caching data in an in-memory cluster computing framework, comprising: caching a data on a first storage device with a first priority in a cluster node, the first storage device selected by an application using the data; monitoring the operation of the first storage device; and if the first storage device is saturated, migrating the cached data to a second storage device with a second priority.

An embodiment of the invention includes a method for caching data in an in-memory cluster computing framework, comprising: caching a data on a first storage device with a first priority in a cluster node, the first storage device having a higher priority among a plurality of devices; monitoring the operation of the first storage device; and if the first storage device is saturated, migrating the cached data to a second storage device with a second priority.

An embodiment of the invention includes a method for caching data in an in-memory cluster computing framework, comprising: caching a data on a first storage device with a first priority in a cluster node; monitoring the operation of the first storage device; if the first storage device is saturated, migrating the cached data to a second storage device with a second priority in the cluster node; and if the first storage device is saturated, re-directing future cache requests for the first storage device in the cluster node to the second storage device.

An embodiment of the invention includes a method for caching data in an in-memory cluster computing framework, comprising: caching a data on a first storage device with a first priority in a cluster node; monitoring the operation of the first storage device; if the first storage device is saturated, migrating the cached data to a second storage device with a second priority; and replicating the cached data on a third storage device.

An embodiment of the invention includes a method for caching data in an in-memory cluster computing framework, comprising: caching a data on a first storage device with a first priority in a cluster node; monitoring the operation of the first storage device; if the first storage device is saturated, migrating the cached data to a second storage device with a second priority; and replicating the cached data on a third storage device, the third storage device having a same priority as the first storage device.

An embodiment of the invention includes a method for caching data in an in-memory cluster computing framework, comprising: caching a data on a first storage device with a first priority in a cluster node; monitoring the operation of the first storage device; if the first storage device is saturated, migrating the cached data to a second storage device with a second priority; and replicating the cached data on a third storage device, the third storage device having a lower priority than the first storage device.

An embodiment of the invention includes a method for caching data in an in-memory cluster computing framework, comprising: caching a data on a first storage device with a first priority in a cluster node; monitoring the operation of the first storage device; if the first storage device is saturated, migrating the cached data to a second storage device with a second priority; and replicating the cached data on a third storage device in a second cluster node.

An embodiment of the invention includes a method for caching data in an in-memory cluster computing framework, comprising: caching a data on a first storage device with a first priority in a cluster node; monitoring the operation of the first storage device; if the first storage device is saturated, migrating the cached data to a second storage device with a second priority; and replicating the cached data on a third storage device in a second cluster node, the third storage device having a same priority as the first storage device.

An embodiment of the invention includes a method for caching data in an in-memory cluster computing framework, comprising: caching a data on a first storage device with a first priority in a cluster node; monitoring the operation of the first storage device; if the first storage device is saturated, migrating the cached data to a second storage device with a second priority; and replicating the cached data on a third storage device in a second cluster node, the third storage device having a lower priority than the first storage device.

An embodiment of the invention includes a method for caching data in an in-memory cluster computing framework, comprising: caching a data on a first storage device with a first priority in a cluster node; monitoring the operation of the first storage device; if the first storage device is saturated, migrating the cached data to a second storage device with a second priority; and replicating the cached data on a third storage device as specified by a user.

An embodiment of the invention includes a method for caching data in an in-memory cluster computing framework, comprising: caching a resilient distributed dataset (RDD) on a first storage device with a first priority in a cluster node; monitoring the operation of the first storage device; and if the first storage device is saturated, migrating the cached data to a second storage device with a second priority.

An embodiment of the invention includes a method for caching data in an in-memory cluster computing framework, comprising: caching a data on a first storage device with a first priority in a cluster node; monitoring the operation of the first storage device; and if the first storage device is saturated, migrating all cached data to a second storage device with a second priority.

An embodiment of the invention includes a method for caching data in an in-memory cluster computing framework, comprising: caching a data on a first storage device with a first priority in a cluster node; monitoring the operation of the first storage device; and if the first storage device is saturated, migrating an oldest cached data to a second storage device with a second priority.

An embodiment of the invention includes an article, comprising a tangible storage medium, said tangible storage medium having stored thereon non-transitory instructions that, when executed by a machine, result in, comprising: caching a data on a first storage device with a first priority in a cluster node; monitoring the operation of the first storage device; and if the first storage device is saturated, migrating the cached data to a second storage device with a second priority.

An embodiment of the invention includes an article, comprising a tangible storage medium, said tangible storage medium having stored thereon non-transitory instructions that, when executed by a machine, result in, comprising: caching a data on a first storage device with a first priority in a cluster node; monitoring the operation of the first storage device including monitoring a bandwidth saturation of the first storage device; and if the first storage device is saturated, migrating the cached data to a second storage device with a second priority.

An embodiment of the invention includes an article, comprising a tangible storage medium, said tangible storage medium having stored thereon non-transitory instructions that, when executed by a machine, result in, comprising: caching a data on a first storage device with a first priority in a cluster node; monitoring the operation of the first storage device including monitoring a capacity saturation of the first storage device; and if the first storage device is saturated, migrating the cached data to a second storage device with a second priority.

An embodiment of the invention includes an article, comprising a tangible storage medium, said tangible storage medium having stored thereon non-transitory instructions that, when executed by a machine, result in, comprising: caching a data on a first storage device with a first priority in a cluster node; monitoring the operation of the first storage device including determining a capability of the first storage device; and if the first storage device is saturated, migrating the cached data to a second storage device with a second priority.

An embodiment of the invention includes an article, comprising a tangible storage medium, said tangible storage medium having stored thereon non-transitory instructions that, when executed by a machine, result in, comprising: caching a data on a first storage device with a first priority in a cluster node; monitoring the operation of the first storage device including determining a capability of the first storage device including accessing performance information from the first storage device; and if the first storage device is saturated, migrating the cached data to a second storage device with a second priority.

An embodiment of the invention includes an article, comprising a tangible storage medium, said tangible storage medium having stored thereon non-transitory instructions that, when executed by a machine, result in, comprising: caching a data on a first storage device with a first priority in a cluster node; monitoring the operation of the first storage device including determining a capability of the first storage device including running a profiler on the first storage device; and if the first storage device is saturated, migrating the cached data to a second storage device with a second priority.

An embodiment of the invention includes an article, comprising a tangible storage medium, said tangible storage medium having stored thereon non-transitory instructions that, when executed by a machine, result in, comprising: caching a data on a first storage device with a first priority in a cluster node; monitoring the operation of the first storage device including determining a capability of the first storage device including determining current bandwidth for the first storage device from run-time monitoring; and if the first storage device is saturated, migrating the cached data to a second storage device with a second priority.

An embodiment of the invention includes an article, comprising a tangible storage medium, said tangible storage medium having stored thereon non-transitory instructions that, when executed by a machine, result in, comprising: caching a data on a first storage device with a first priority in a cluster node, the first storage device selected by an application using the data; monitoring the operation of the first storage device; and if the first storage device is saturated, migrating the cached data to a second storage device with a second priority.

An embodiment of the invention includes an article, comprising a tangible storage medium, said tangible storage medium having stored thereon non-transitory instructions that, when executed by a machine, result in, comprising: caching a data on a first storage device with a first priority in a cluster node, the first storage device having a higher priority among a plurality of devices; monitoring the operation of the first storage device; and if the first storage device is saturated, migrating the cached data to a second storage device with a second priority.

An embodiment of the invention includes an article, comprising a tangible storage medium, said tangible storage medium having stored thereon non-transitory instructions that, when executed by a machine, result in, comprising: caching a data on a first storage device with a first priority in a cluster node; monitoring the operation of the first storage device; if the first storage device is saturated, migrating the cached data to a second storage device with a second priority; and replicating the cached data on a third storage device.

An embodiment of the invention includes an article, comprising a tangible storage medium, said tangible storage medium having stored thereon non-transitory instructions that, when executed by a machine, result in, comprising: caching a data on a first storage device with a first priority in a cluster node; monitoring the operation of the first storage device; if the first storage device is saturated, migrating the cached data to a second storage device with a second priority; and replicating the cached data on a third storage device, the third storage device having a same priority as the first storage device.

An embodiment of the invention includes an article, comprising a tangible storage medium, said tangible storage medium having stored thereon non-transitory instructions that, when executed by a machine, result in, comprising: caching a data on a first storage device with a first priority in a cluster node; monitoring the operation of the first storage device; if the first storage device is saturated, migrating the cached data to a second storage device with a second priority; and replicating the cached data on a third storage device, the third storage device having a lower priority than the first storage device.

An embodiment of the invention includes an article, comprising a tangible storage medium, said tangible storage medium having stored thereon non-transitory instructions that, when executed by a machine, result in, comprising: caching a data on a first storage device with a first priority in a cluster node; monitoring the operation of the first storage device; if the first storage device is saturated, migrating the cached data to a second storage device with a second priority; and replicating the cached data on a third storage device in a second cluster node.

An embodiment of the invention includes an article, comprising a tangible storage medium, said tangible storage medium having stored thereon non-transitory instructions that, when executed by a machine, result in, comprising: caching a data on a first storage device with a first priority in a cluster node; monitoring the operation of the first storage device; if the first storage device is saturated, migrating the cached data to a second storage device with a second priority; and replicating the cached data on a third storage device in a second cluster node, the third storage device having a same priority as the first storage device.

An embodiment of the invention includes an article, comprising a tangible storage medium, said tangible storage medium having stored thereon non-transitory instructions that, when executed by a machine, result in, comprising: caching a data on a first storage device with a first priority in a cluster node; monitoring the operation of the first storage device; if the first storage device is saturated, migrating the cached data to a second storage device with a second priority; and replicating the cached data on a third storage device in a second cluster node, the third storage device having a lower priority than the first storage device.

An embodiment of the invention includes an article, comprising a tangible storage medium, said tangible storage medium having stored thereon non-transitory instructions that, when executed by a machine, result in, comprising: caching a data on a first storage device with a first priority in a cluster node; monitoring the operation of the first storage device; if the first storage device is saturated, migrating the cached data to a second storage device with a second priority; and replicating the cached data on a third storage device as specified by a user.

An embodiment of the invention includes an article, comprising a tangible storage medium, said tangible storage medium having stored thereon non-transitory instructions that, when executed by a machine, result in, comprising: caching a resilient distributed dataset (RDD) on a first storage device with a first priority in a cluster node; monitoring the operation of the first storage device; and if the first storage device is saturated, migrating the cached data to a second storage device with a second priority.

An embodiment of the invention includes an article, comprising a tangible storage medium, said tangible storage medium having stored thereon non-transitory instructions that, when executed by a machine, result in, comprising: caching a data on a first storage device with a first priority in a cluster node; monitoring the operation of the first storage device; and if the first storage device is saturated, migrating all cached data to a second storage device with a second priority.

An embodiment of the invention includes an article, comprising a tangible storage medium, said tangible storage medium having stored thereon non-transitory instructions that, when executed by a machine, result in, comprising: caching a data on a first storage device with a first priority in a cluster node; monitoring the operation of the first storage device; and if the first storage device is saturated, migrating an oldest cached data to a second storage device with a second priority.
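
The cache, monitor, and migrate sequence recited in the embodiments above, including the "all cached data" and "oldest cached data" variants, can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the claimed implementation. The Device class, the migrate function, and the 90% threshold are hypothetical. Saturation is modeled only as used capacity, and the "oldest" policy is assumed to keep moving the oldest entries until the first device is no longer saturated.

```python
from collections import OrderedDict


class Device:
    """Hypothetical storage device with a priority and a byte capacity."""

    def __init__(self, name, priority, capacity):
        self.name = name
        self.priority = priority    # assumption: smaller number = higher priority
        self.capacity = capacity    # assumption: capacity is the monitored metric
        self.cache = OrderedDict()  # insertion order tracks age, oldest first

    def used(self):
        """Total bytes currently cached on this device."""
        return sum(len(value) for value in self.cache.values())

    def saturated(self, threshold=0.9):
        """Monitor check: True once usage reaches the assumed threshold."""
        return self.used() >= threshold * self.capacity


def migrate(first, second, policy="oldest"):
    """Move cached entries from first to second if first is saturated.

    Returns the list of migrated keys in the order they were moved.
    """
    moved = []
    if not first.saturated():
        return moved

    if policy == "all":
        # Variant: migrate all cached data.
        while first.cache:
            key, value = first.cache.popitem(last=False)
            second.cache[key] = value
            moved.append(key)
    elif policy == "oldest":
        # Variant: migrate the oldest cached data first; assumed to repeat
        # until the first device drops below the saturation threshold.
        while first.cache and first.saturated():
            key, value = first.cache.popitem(last=False)
            second.cache[key] = value
            moved.append(key)
    else:
        raise ValueError(f"unknown migration policy: {policy}")
    return moved
```

In this sketch, a resource monitor would call saturated() periodically and a resource scheduler would call migrate() with the lower-priority device as the second argument.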

The following discussion is intended to provide a brief, general description of a suitable machine or machines in which certain aspects of the inventive concept can be implemented. Typically, the machine or machines include a system bus to which is attached processors, memory, e.g., random access memory (RAM), read-only memory (ROM), or other state preserving medium, storage devices, a video interface, and input/output interface ports. The machine or machines can be controlled, at least in part, by input from conventional input devices, such as keyboards, mice, etc., as well as by directives received from another machine, interaction with a virtual reality (VR) environment, biometric feedback, or other input signal. As used herein, the term “machine” is intended to broadly encompass a single machine, a virtual machine, or a system of communicatively coupled machines, virtual machines, or devices operating together. Exemplary machines include computing devices such as personal computers, workstations, servers, portable computers, handheld devices, telephones, tablets, etc., as well as transportation devices, such as private or public transportation, e.g., automobiles, trains, cabs, etc.

The machine or machines can include embedded controllers, such as programmable or non-programmable logic devices or arrays, Application Specific Integrated Circuits (ASICs), embedded computers, smart cards, and the like. The machine or machines can utilize one or more connections to one or more remote machines, such as through a network interface, modem, or other communicative coupling. Machines can be interconnected by way of a physical and/or logical network, such as an intranet, the Internet, local area networks, wide area networks, etc. One skilled in the art will appreciate that network communication can utilize various wired and/or wireless short range or long range carriers and protocols, including radio frequency (RF), satellite, microwave, Institute of Electrical and Electronics Engineers (IEEE) 802.11, Bluetooth®, optical, infrared, cable, laser, etc.

Embodiments of the present inventive concept can be described by reference to or in conjunction with associated data including functions, procedures, data structures, application programs, etc. which when accessed by a machine results in the machine performing tasks or defining abstract data types or low-level hardware contexts. Associated data can be stored in, for example, the volatile and/or non-volatile memory, e.g., RAM, ROM, etc., or in other storage devices and their associated storage media, including hard-drives, floppy-disks, optical storage, tapes, flash memory, memory sticks, digital video disks, biological storage, etc. Associated data can be delivered over transmission environments, including the physical and/or logical network, in the form of packets, serial data, parallel data, propagated signals, etc., and can be used in a compressed or encrypted format. Associated data can be used in a distributed environment, and stored locally and/or remotely for machine access.

Embodiments of the inventive concept can include a tangible, non-transitory machine-readable medium comprising instructions executable by one or more processors, the instructions comprising instructions to perform the elements of the inventive concepts as described herein.

Having described and illustrated the principles of the inventive concept with reference to illustrated embodiments, it will be recognized that the illustrated embodiments can be modified in arrangement and detail without departing from such principles, and can be combined in any desired manner. And, although the foregoing discussion has focused on particular embodiments, other configurations are contemplated. In particular, even though expressions such as “according to an embodiment of the inventive concept” or the like are used herein, these phrases are meant to generally reference embodiment possibilities, and are not intended to limit the inventive concept to particular embodiment configurations. As used herein, these terms can reference the same or different embodiments that are combinable into other embodiments.

The foregoing illustrative embodiments are not to be construed as limiting the inventive concept thereof. Although a few embodiments have been described, those skilled in the art will readily appreciate that many modifications are possible to those embodiments without materially departing from the novel teachings and advantages of the present disclosure. Accordingly, all such modifications are intended to be included within the scope of this inventive concept as defined in the claims.

Consequently, in view of the wide variety of permutations to the embodiments described herein, this detailed description and accompanying material is intended to be illustrative only, and should not be taken as limiting the scope of the invention. What is claimed as the invention, therefore, is all such modifications as may come within the scope and spirit of the following claims and equivalents thereto.

What is claimed is:
 1. An in-memory cluster computing framework node, comprising: a processor; a first storage device storing cached data, the first storage device having a first priority ranking the first storage device according to at least one metric; a second storage device having a second priority ranking the second storage device according to the at least one metric; a resource monitor operative to monitor the first storage device; and a resource scheduler operative to migrate the cached data from the first storage device to the second storage device if the resource monitor indicates that the first storage device is approaching a performance characteristic limit according to the at least one metric, wherein the first priority for the first storage device and the second priority for the second storage device are determined without reference to an application and the application's data.
 2. The in-memory cluster computing framework node according to claim 1, wherein the resource monitor is operative to determine the capabilities of the first storage device.
 3. The in-memory cluster computing framework node according to claim 1, wherein the resource scheduler is operative to select the first storage device to initially cache the data based on information provided by an application that uses the data.
 4. The in-memory cluster computing framework node according to claim 1, wherein: the first priority is higher than the second priority; and the resource scheduler is operative to select the first storage device to initially cache the data as a higher priority device.
 5. The in-memory cluster computing framework node according to claim 4, wherein the resource scheduler is operative to select the second storage device for future data caching if the resource monitor indicates that the first storage device is approaching the performance characteristic limit according to the at least one metric.
 6. The in-memory cluster computing framework node according to claim 1, further comprising a replicator to replicate the cached data on a third storage device.
 7. The in-memory cluster computing framework node according to claim 6, wherein the third storage device is in a second in-memory cluster computing framework node.
 8. The in-memory cluster computing framework node according to claim 1, wherein the data includes a resilient distributed dataset (RDD) on the first storage device.
 9. The in-memory cluster computing framework node according to claim 1, wherein the resource scheduler is operative to migrate all data from the first storage device to the second storage device.
 10. The in-memory cluster computing framework node according to claim 1, wherein the resource scheduler is operative to migrate an oldest data from the first storage device to the second storage device.
 11. The in-memory cluster computing framework node according to claim 1, wherein the at least one metric is drawn from a set including latency and bandwidth.
 12. A method for caching data in an in-memory cluster computing framework, comprising: caching a data on a first storage device with a first priority in a cluster node as a cached data, the first priority ranking the first storage device according to at least one metric; monitoring the operation of the first storage device; and if the first storage device is approaching a performance characteristic limit according to the at least one metric, migrating the cached data to a second storage device with a second priority, the second priority ranking the second storage device according to the at least one metric, wherein the first priority for the first storage device and the second priority for the second storage device are determined without reference to an application and the application's data.
 13. The method according to claim 12, wherein monitoring the operation of the first storage device includes determining a capability of the first storage device.
 14. The method according to claim 12, wherein caching a data on a first storage device with a first priority in a cluster node includes caching the data on the first storage device in the cluster node, the first storage device selected by an application using the data.
 15. The method according to claim 12, wherein caching a data on a first storage device with a first priority in a cluster node includes caching the data on the first storage device in the cluster node, the first storage device having a higher priority among a plurality of devices.
 16. The method according to claim 15, further comprising, if the first storage device is approaching the performance characteristic limit according to the at least one metric, re-directing future cache requests for the first storage device in the cluster node to the second storage device.
 17. The method according to claim 12, further comprising replicating the cached data on a third storage device.
 18. The method according to claim 17, wherein replicating the cached data on a third storage device includes replicating the cached data on the third storage device in a second cluster node.
 19. The method according to claim 12, wherein caching a data on a first storage device with a first priority in a cluster node includes caching a resilient distributed dataset (RDD) on the first storage device.
 20. The method according to claim 12, wherein migrating the cached data to a second storage device with a second priority includes migrating all data on the first storage device to the second storage device.
 21. The method according to claim 12, wherein migrating the cached data to a second storage device with a second priority includes migrating an oldest data on the first storage device to the second storage device.
 22. The method according to claim 12, wherein the at least one metric is drawn from a set including latency and bandwidth.